Transforming trees into hedges and parsing with "hedgebank" grammars
Abstract
Finite-state chunking and tagging methods are very fast for annotating nonhierarchical syntactic information, and are often applied in applications that do not require full syntactic analyses. Scenarios such as incremental machine translation may benefit from some degree of hierarchical syntactic analysis without requiring fully connected parses. We introduce hedge parsing as an approach to recovering constituents of length up to some maximum span L. This approach improves efficiency by bounding constituent size, and allows for efficient segmentation strategies prior to parsing. Unlike shallow parsing methods, hedge parsing yields internal hierarchical structure of phrases within its span bound. We present the approach and some initial experiments on different inference strategies.
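The idea of bounding constituents by a maximum span L can be illustrated with a small sketch. It is not the authors' implementation: the nested-tuple tree encoding and the names `span` and `to_hedge` are assumptions made here for illustration. The sketch drops every constituent covering more than `max_span` words and keeps the remaining subtrees, in order, as a hedge.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: a leaf is a word (str); an internal node is a
# tuple (label, child_1, ..., child_n).
Tree = Union[str, Tuple]


def span(tree: Tree) -> int:
    """Number of words covered by a subtree."""
    if isinstance(tree, str):
        return 1
    return sum(span(child) for child in tree[1:])


def to_hedge(tree: Tree, max_span: int) -> List[Tree]:
    """Return the left-to-right sequence of subtrees that remains after
    removing every constituent whose span exceeds max_span."""
    if isinstance(tree, str) or span(tree) <= max_span:
        return [tree]  # small enough: keep with its internal structure
    hedge: List[Tree] = []
    for child in tree[1:]:
        hedge.extend(to_hedge(child, max_span))  # splice children in place
    return hedge


tree = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("VBD", "chased"),
               ("NP", ("DT", "a"), ("NN", "cat"))))

hedge_3 = to_hedge(tree, 3)  # S (span 5) is removed; NP and VP survive
hedge_2 = to_hedge(tree, 2)  # VP (span 3) is removed as well
```

With `max_span = 3` the sentence node is dropped but both phrases keep their internal structure; with `max_span = 2` the verb phrase is also split into its children, which is what distinguishes a hedge from a flat chunk sequence.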
Similar resources
Balanced Context-Free Grammars, Hedge Grammars and Pushdown Caterpillar Automata
The XML community generally takes trees and hedges as the model for XML document instances and element content. In contrast, Berstel and Boasson have discussed XML documents in the framework of extended context-free grammar, modeling XML documents as Dyck strings and schemas as balanced grammars. How can these two models be brought closer together? We examine the close relationship between Dyck ...
Parsing Algorithms for Grammars with Regulated Rewriting
In recent papers [4, 5, 8, 11] Petri net controlled grammars have been introduced and investigated. It was shown that various regulated grammars, such as random context, matrix, vector, and valence grammars, which result from enriching context-free grammars with additional mechanisms, can be unified into the Petri net formalism, i.e., a grammar and its control can be represented by a Petri net. Thi...
Transforming Dependency Structures to LTAG Derivation Trees
We propose a new algorithm for parsing Lexicalized Tree Adjoining Grammars (LTAGs) which uses pre-assigned bilexical dependency relations as a filter. That is, given a sentence and its corresponding well-formed dependency structure, the parser assigns elementary trees to the words of the sentence and returns attachment sites compatible with these elementary trees and predefined dependencies. Moreove...
Parsing Algorithms for Regulated Grammars
Petri nets, introduced by Carl Adam Petri [12] in 1962, provide a powerful mathematical formalism for describing and analyzing the flow of information and control in concurrent systems. Petri nets can successfully be used as control mechanisms for grammars, i.e., the generative devices of formal languages. In recent papers [4], [5], [9], [16] Petri net controlled grammars have been introduced a...
Parsing Tree Adjoining Grammars and Tree Insertion Grammars with Simultaneous Adjunctions
A large part of wide coverage Tree Adjoining Grammars (TAG) is formed by trees that satisfy the restrictions imposed by Tree Insertion Grammars (TIG). This characteristic can be used to reduce the practical complexity of TAG parsing, applying the standard adjunction operation only in those cases in which the simpler cubic-time TIG adjunction cannot be applied. In this paper, we describe a parsi...